Whispered Speech Detection Using Glottal Flow-Based Features
Authors
Abstract
Recent studies have reported that the performance of Automatic Speech Recognition (ASR) technologies designed for normal speech notably deteriorates when evaluated on whispered speech. Therefore, whispered speech detection is useful for attenuating the mismatch between training and testing situations. This paper proposes two new Glottal Flow (GF)-based features for whispered speech detection: the GF-based Mel-Frequency Cepstral Coefficient (GF-MFCC) as a magnitude-based feature and the GF-based relative phase (GF-RP) as a phase-based feature. The main contribution of the proposed features is to extract magnitude and phase information from the GF signal. In GF-MFCC, Mel-frequency cepstral coefficient (MFCC) extraction is modified to use the estimated GF signal, derived from iterative adaptive inverse filtering, as the input in place of the raw speech signal. In a similar way, GF-RP is a modification of relative phase (RP) extraction that uses the GF signal instead of the raw speech signal. Whispered speech production provides a lower-amplitude glottal source than normal speech production; thus, the magnitude and phase information extracted from the GF signal via the Discrete Fourier Transformation (DFT) is hypothesized to differ between the two types of speech, which our proposed features are designed to exploit. In addition to the individual GF-MFCC/GF-RP features, feature-level and score-level combinations are also proposed to further improve detection performance. The performance of the proposed features and combinations in this study is investigated using the CHAIN corpus. GF-MFCC outperforms MFCC, while GF-RP has higher performance than RP. Further improved results are obtained via the feature-level combination of MFCC and GF-MFCC (MFCC&GF-MFCC) and of RP and GF-RP (RP&GF-RP), compared with either feature alone. The combined score of MFCC&GF-MFCC and RP&GF-RP gives the best frame-level accuracy of 95.01% and an utterance-level accuracy of 100%.
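The feature-level and score-level combinations mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that feature-level combination means concatenating per-frame feature vectors, that score-level combination is a weighted sum of per-frame classifier scores (positive meaning "whispered"), and that an utterance is labelled by its mean frame score. The function names, the `weight` parameter, and the `threshold` parameter are hypothetical.

```python
import numpy as np


def feature_level_combination(feat_a, feat_b):
    """Concatenate two (frames x dims) feature matrices along the feature axis."""
    feat_a = np.asarray(feat_a, dtype=float)
    feat_b = np.asarray(feat_b, dtype=float)
    if feat_a.shape[0] != feat_b.shape[0]:
        raise ValueError("both feature streams need the same number of frames")
    return np.concatenate([feat_a, feat_b], axis=1)


def score_level_combination(scores_a, scores_b, weight=0.5):
    """Weighted sum of two per-frame score streams (weight is an assumed tuning knob)."""
    scores_a = np.asarray(scores_a, dtype=float)
    scores_b = np.asarray(scores_b, dtype=float)
    if scores_a.shape != scores_b.shape:
        raise ValueError("both score streams need the same shape")
    return weight * scores_a + (1.0 - weight) * scores_b


def frame_decisions(scores, threshold=0.0):
    """Per-frame decision: True where the score exceeds the threshold."""
    return np.asarray(scores, dtype=float) > threshold


def utterance_decision(scores, threshold=0.0):
    """Utterance-level decision from the mean frame score (an assumed rule)."""
    return bool(np.mean(scores) > threshold)


if __name__ == "__main__":
    # Toy stand-ins for 4 frames of 13-dimensional MFCC and GF-MFCC features.
    mfcc = np.zeros((4, 13))
    gf_mfcc = np.ones((4, 13))
    print(feature_level_combination(mfcc, gf_mfcc).shape)

    # Toy per-frame scores from a magnitude-based and a phase-based system.
    fused = score_level_combination([1.0, -0.5, 2.0, 0.5], [0.0, -1.5, 1.0, 1.5])
    print(frame_decisions(fused), utterance_decision(fused))
```

With real data, the two feature matrices would come from the MFCC and GF-MFCC (or RP and GF-RP) front ends, and the score streams from the classifiers trained on each combined feature.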
Similar resources
Fractal Characteristic-Based Endpoint Detection for Whispered Speech
In this paper, a fractal based approach is proposed to detect endpoints in whispered speech. The underlying principle is based on the fact that whispered speech is sufficiently chaotic and thus can be analyzed using fractal theory. Due to the different scope of fractal dimensions of silence, noise and speech segment, speech/non-speech segment could be determined from that with a simple decision...
Glottal Source Features for Automatic Speech-Based Depression Assessment
Depression is one of the most prominent mental disorders, with an increasing rate that makes it the fourth cause of disability worldwide. The field of automated depression assessment has emerged to aid clinicians in the form of a decision support system. Such a system could assist as a pre-screening tool, or even for monitoring high risk populations. Related work most commonly involves multimod...
SVM-based speech endpoint detection using contextual speech features
Shown is an effective speech endpoint detection algorithm using a trained support vector machine (SVM) and a feature vector including contextual information speech features. With this and other innovations the proposed algorithm yields high discrimination and reports significant improvements over standard methods and algorithms defining the decision rule in terms of averaged subband speech feat...
Classification-Based Detection of Glottal Closure Instants from Speech Signals
In this paper a classification-based method for the automatic detection of glottal closure instants (GCIs) from the speech signal is proposed. Peaks in the speech waveforms are taken as candidates for GCI placements. A classification framework is used to train a classification model and to classify whether or not a peak corresponds to the GCI. We show that the detection accuracy in terms of F1 ...
The Mechanical Design of Drowsiness Detection Using Color Based Features
This paper demonstrates design and fabrication of a mechatronic system for human drowsiness detection. This system can be used in multiple places. For example, in factories, it is used on some dangerous machinery and in cars in order to prevent the operator or driver from falling asleep. This system is composed of three parts: (1) mechanical, (2) electrical and (3) image processing system. A...
Journal
Journal title: Symmetry
Year: 2022
ISSN: 0865-4824, 2226-1877
DOI: https://doi.org/10.3390/sym14040777